
12 research outputs found

Penalized model-based clustering with cluster-specific diagonal covariance matrices and grouped variables

Author: Pan Wei
Shen Xiaotong
Xie Benhuai
Publication venue: Institute of Mathematical Statistics
Publication date: 01/01/2008

Clustering analysis is one of the most widely used statistical tools in many emerging areas such as microarray data analysis. For microarray and other high-dimensional data, the presence of many noise variables may mask underlying clustering structures. Hence removing noise variables via variable selection is necessary. For simultaneous variable selection and parameter estimation, existing penalized likelihood approaches in model-based clustering analysis all assume a common diagonal covariance matrix across clusters, which, however, may not hold in practice. To analyze high-dimensional data, particularly those with relatively low sample sizes, this article introduces a novel approach that shrinks the variances together with the means, in a more general situation with cluster-specific (diagonal) covariance matrices. Furthermore, selection of grouped variables via inclusion or exclusion of a group of variables altogether is permitted by a specific form of penalty, which facilitates incorporating subject-matter knowledge, such as gene functions in clustering microarray samples for disease subtype discovery. For implementation, EM algorithms are derived for parameter estimation, in which the M-steps clearly demonstrate the effects of shrinkage and thresholding. Numerical examples, including an application to acute leukemia subtype discovery with microarray gene expression data, are provided to demonstrate the utility and advantage of the proposed method. Comment: Published at http://dx.doi.org/10.1214/08-EJS194 in the Electronic Journal of Statistics (http://www.i-journals.org/ejs/) by the Institute of Mathematical Statistics (http://www.imstat.org)
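The abstract above says the M-steps show shrinkage and thresholding. As a rough illustration only, the sketch below applies a soft-thresholded update to cluster means of the kind an L1 penalty on the means typically produces. It is not the paper's derivation: the function names, the argument layout, and the threshold `lam * var / n_k` are assumptions, and the variance shrinkage the paper also proposes is not shown.

```python
import numpy as np

def soft_threshold(z, t):
    """Shrink z toward zero by t; values with |z| <= t become exactly zero."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def penalized_mean_update(X, resp, var, lam):
    """Hypothetical M-step for cluster means under an L1 penalty.

    X    : (n, p) data, assumed centered so a zero mean marks a noise variable
    resp : (n, K) posterior cluster probabilities from the E-step
    var  : (K, p) cluster-specific diagonal variances (held fixed here)
    lam  : penalty weight (assumed scalar)
    """
    n_k = resp.sum(axis=0)                    # effective cluster sizes, shape (K,)
    mu_tilde = (resp.T @ X) / n_k[:, None]    # unpenalized weighted means, (K, p)
    threshold = lam * var / n_k[:, None]      # assumed form of the threshold
    return soft_threshold(mu_tilde, threshold)
```

A variable whose updated mean is zero in every cluster contributes nothing to separating the clusters through its mean, which is how thresholding performs variable selection in this simplified setting.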

arXiv.org e-Print Archive

Crossref

Variable selection in penalized model-based clustering via regularization on grouped parameters

Author: Benhuai Xie
Wei Pan
Xiaotong Shen
Publication date: 01/01/2008

Summary: Penalized model-based clustering has been proposed for high-dimensional data with small sample sizes, such as those arising from genomic studies; in particular, it can be used for variable selection. A new regularization scheme is proposed to group together multiple parameters of the same variable across clusters, which is shown both analytically and numerically to be more effective than the conventional L1 penalty for variable selection. In addition, we develop a strategy to combine this grouping scheme with grouping structured variables. Simulation studies and applications to microarray gene expression data for cancer subtype discovery demonstrate the advantage of the new proposal over several existing approaches.
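The summary above describes grouping the parameters of one variable across clusters. A generic way to illustrate that idea is group (block) soft-thresholding, which shrinks the whole vector of a variable's cluster means by its Euclidean norm, so the variable is either kept in all clusters or removed from all of them. The sketch below is a textbook operator, not the update derived in the paper; the function name and the single threshold `t` are assumptions.

```python
import numpy as np

def group_soft_threshold(mu, t):
    """Shrink each column of mu (one variable across K clusters) as a group.

    mu : (K, p) cluster means; column j holds variable j's mean in every cluster
    t  : threshold (assumed scalar, shared by all variables)
    """
    norms = np.linalg.norm(mu, axis=0)                          # (p,) group norms
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)  # 0 drops the group
    return mu * scale                                           # broadcast over rows
```

In contrast to elementwise L1 thresholding, a column here is zeroed only when its joint norm falls below `t`, so a variable is never dropped in some clusters while being retained in others.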

CiteSeerX

Functional group-based linkage analysis of gene expression trait loci-3

Author: Baolin Wu (20390)
Benhuai Xie (46488)
Guanghua Xiao (46490)
Na Li (6550)
Peng Wei (46487)
Wei Pan (701)
Yang Xie (46489)

Phosphoinositide-mediated signaling) and 5 (regulation of cyclin dependent protein kinase activity). Copyright information: Taken from "Functional group-based linkage analysis of gene expression trait loci", http://www.biomedcentral.com/1753-6561/1/S1/S117, BMC Proceedings 2007;1(Suppl 1):S117-S117. Published online 18 Dec 2007. PMCID: PMC2367612.

FigShare

Functional group-based linkage analysis of gene expression trait loci-0

Author: Baolin Wu (20390)
Benhuai Xie (46488)
Guanghua Xiao (46490)
Na Li (6550)
Peng Wei (46487)
Wei Pan (701)
Yang Xie (46489)

Ed signaling; 3) GTP biosynthesis; 4) purine nucleotide biosynthesis; 5) regulation of cyclin dependent protein kinase activity; 6) meiosis; 7) mRNA-nucleus export; 8) cholesterol metabolism; 9) biosynthesis; and 10) epidermis development. Copyright information: Taken from "Functional group-based linkage analysis of gene expression trait loci", http://www.biomedcentral.com/1753-6561/1/S1/S117, BMC Proceedings 2007;1(Suppl 1):S117-S117. Published online 18 Dec 2007. PMCID: PMC2367612.

FigShare

Functional group-based linkage analysis of gene expression trait loci-4

Author: Baolin Wu (20390)
Benhuai Xie (46488)
Guanghua Xiao (46490)
Na Li (6550)
Peng Wei (46487)
Wei Pan (701)
Yang Xie (46489)

FigShare

Pairwise correlations of the ten functional groups with highest mean heritability

Author: Baolin Wu (20390)
Benhuai Xie (46488)
Guanghua Xiao (46490)
Na Li (6550)
Peng Wei (46487)
Wei Pan (701)
Yang Xie (46489)

Copyright information: Taken from "Functional group-based linkage analysis of gene expression trait loci", http://www.biomedcentral.com/1753-6561/1/S1/S117, BMC Proceedings 2007;1(Suppl 1):S117-S117. Published online 18 Dec 2007. PMCID: PMC2367612.

FigShare

Penalized mixtures of factor analyzers with application to clustering high-dimensional microarray data

Author: Benhuai Xie
Wei Pan
Xiaotong Shen
Publication venue: Oxford University Press
Publication date: 20/03/2010

Motivation: Model-based clustering has been widely used, e.g. in microarray data analysis. Since variable selection is necessary for high-dimensional data, several penalized model-based clustering methods have been proposed to realize simultaneous variable selection and clustering. However, the existing methods all assume that the variables are independent with the use of diagonal covariance matrices.

CiteSeerX

Crossref

PubMed Central